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Abstract: Dust storms in arid and desert areas affect radiation budget, air quality, visibility, enzymatic 
activities, agricultural products and human health. Due to increased drought and land use changes in 
recent years, the frequency of dust storms occurrence in Iran has been increased. This study aims to 
identify dust source areas in the Sistan watershed (Iran-Afghanistan borders)—an important regional source 
for dust storms in southwestern Asia, using remote sensing (RS) and bivariate statistical models. 
Furthermore, this study determines the relative importance of factors controlling dust emissions using 
frequency ratio (FR) and weights of evidence (WOE) models and interpretability of predictive models 
using game theory. For this purpose, we identified 211 dust sources in the study area and generated a dust 
source distribution map—inventory map—by dust source potential index based on RS data. In addition, 
spatial maps of topographic factors affecting dust source areas including soil, lithology, slope, Normalized 
difference vegetation index (NDVI), geomorphology and land use were prepared. The performance of 
two models (WOE and FR) was evaluated using the area under curve (AUC) of the receiver operating 
characteristic curve. The results showed that soil, geomorphology and slope exhibited the greatest 
influence in the dust source areas. The 55.3% (according to FR) and 62.6% (according to WOE) of the 
total area were classified as high and very high potential dust sources, while both models displayed 
acceptable accuracy with subsurface levels of 0.704 for FR and 0.751 for WOE, although they predict 
different fractions of dust potential classes. Based on Shapley additive explanations (SHAP), three factors, 
i.e., soil, slope and NDVI have the highest impact on the model's output. Overall, combination of 
statistic-based predictive models (or data mining models), RS and game theory techniques can provide 
accurate maps of dust source areas in arid and semi-arid regions, which can be helpful for mitigation of 
negative effects of dust storms. 
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1 Introduction 


Atmospheric dust is the most abundant aerosol type over land, which affects climate, water cycle, 
cloud, solar radiation, terrestrial and marine ecosystems, public health and welfare (Soltani et al., 
2015; Raspanti et al., 2016; Stafoggia et al., 2016; Shahsavani et al., 2020; Suresh et al., 2021). 
Dust storms usually occur in arid and desert areas with annual precipitation <200 mm 
(Engelstaedter et al., 2006; Indoitu et al., 2012) and these phenomena are especially frequent in 
the Sahara and Middle East facilitated by local cyclonicity and pressure gradients that trigger 
intense local winds (Schepanski et al., 2012; Yu et al., 2015, 2016; Francis et al., 2019, 2021). 
More than 2x10? people live in areas, which are exposed to the phenomenon of dust storms, with 
serious deleterious impacts on socio-economic and health (Middleton, 2017). During the last 
decades, due to climate change, global warming and expanded desertification, dust aerosols have 
increased in arid and semi-arid regions of the Middle East (Notaro et al., 2015; Klingmiiller et al., 
2016; Middleton, 2019; Shaheen et al., 2020; Boloorani et al., 2021; Modarres, 2021; Rashki et 
al., 2021)). The Middle East is the most important dust source after North Africa, while the 
maximum dust emission occurs in summer. About 90% dust of the Middle East is produced in the 
Tigris-Euphrates floodplain, the Syrian-Iraqi desert, the deserts in northern, eastern and southern 
Saudi Arabia and the Sistan watershed in eastern Iran (Cao et al., 2015; Boroughani et al., 2020). 

RS is one of the most important techniques for identifying dust source areas around the world 
(Baddock et al., 2011; Bilal et al., 2014; Jiao et al., 2021). Therefore, many studies have used 
several satellite sensors, RS techniques and data mining models to identify dust source areas and 
dust emissions over the world (Miller, 2003; Lee et al., 2009; Zhang et al., 2010; Liu et al., 2011; 
Miller et al., 2012; Park et al., 2014; Namdari et al., 2018; Soni et al., 2018; Boroughani et al., 
2019; Gholami et al., 2020a). Most studies used the moderate-resolution imaging 
spectroradiometer (MODIS) observations and developed dust detection indices (brightness 
temperature difference (BTD2931 and BTD3132) NDVI, D-parameters and over-land dust 
enhancement technique) to identify dust source areas at different regions of the world. The 
identification and mapping of dust source can be considered as the first step to manage and reduce 
dust effects on environment and human health (Emamian et al., 2021). Data mining, as a typical 
technique for mapping environmental and natural hazards, is a way to discover new and 
potentially useful information from a large amount of data (Yilmaz, 2009; Gholami et al., 2020b). 
Numerous studies have used data mining with different methods dealing with identification of 
land susceptibility to dust sources, landslides, wind erosion hazard, subsidence, fires, PMio 
concentrations, groundwater, gully erosion and floods and have addressed the potential mapping 
of these factors around the world (Dube et al., 2014; Chen et al., 2015; Motevalli et al., 2019; 
Boroughani et al., 2020; Lee et al., 2021). 

Sistan watershed is one of the most important dust source areas in southwestern Asia 
(Kaskaoutis et al., 2016). Many studies have analyzed different aspects of dust storms in this area, 
especially in the basin of Iran, including health issues (Behrooz et al., 2021; Javan, et al., 2021), 
geomorphology (Evenstar et al., 2018), climatology (Kaskaoutis et al., 2018; Hamidianpour et al., 
2021), water resources (Mianabadi et al., 2021), dust provenance (Behrooz et al., 2019), land 
susceptibility to dust emissions (Gholami et al., 2020c), mineralogy and geochemistry of dust 
(Rashki et al., 2013). Dust storms in the Sistan watershed mainly originated from the dried playas 
of the Hamoun Lake, with high possibility of saline dust storms due to high contents of remaining 
salts in the lake beds after dryness (Rashki et al., 2013; Behrooz et al., 2019). Gholami et al. 
(2020c) applied several data mining models for classifying land susceptibility to dust emissions in 
the Sistan watershed and they reported that most of the study area is classified as high and very 
high susceptibility levels for dust emissions. Namdari et al. (2021) examined the relationship of 
dust storms with wind speed and vegetation cover in the Sistan watershed using RS data and 
found a positive feedback of wind and sparse vegetation to dust emissions. Recently, 
Ebrahimi-Khusfi et al. (2021) predicted the number of dusty days in the Sistan watershed by 
means of feature selection and machine learning techniques, with satisfactory results about the 
model's performance. Most previous studies in the Sistan watershed have focused on the dried-up 
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part of the Hamoun Lake, without covering the whole watershed region. Furthermore, in these 
studies, only machine learning techniques have been used, without examining all the factors 
affecting dust source areas. The present study examines the whole Sistan watershed using RS and 
machine learning techniques and determines the importance of various topographic factors that 
influence the establishment of dust source areas. 

In this study, RS techniques, data mining models like FR, WOE and game theory were 
synergized to identify dust source areas. Potential map of dust sources in the whole Sistan 
watershed, including parts of Iran and Afghanistan were plotted. MODIS satellite images were 
processed using various RS techniques to identify dust source areas—inventory map-in the study 
domain. Two statistical algorithms (FR and WOE) were used to prepare potential maps of dust 
sources, while the accuracy of identified dust sources was confirmed through a field survey. The 
performance of the models was evaluated using the AUC and the contributing variables for 
mapping dust potential. Finally, the game theory through SHAP and permutation feature 
importance measure (PFIM) was applied to address the interpretability of predictive models. 


2 Study area and methods 
2.1 Study area 


The study area used here is the whole Sistan watershed, including Helmand River (in Afghanistan) 
and Sistan and Baluchestan (in Iran) watersheds, with a total area of about 148,779 km? (Fig. 1). 

The most important cities in the study area include Zahedan, Zabol in Iran and Zaranj in 

Afghanistan. The region is characterized by an arid/desert climate with an intense seasonal 

(summer) north wind (Levar wind), summer thermal cyclones and westerly winter flows (Abbasi 

et al., 2019; Hamidianpour et al., 2021). The average annual precipitation in the study area is 

about 60 to 130 mm. The temperature reaches about 50°C in summer and drops to —7°C to —8°C 

in winter, with annual average temperature of 21°C, relative humidity of 38% and potential 

evapotranspiration ranging from 4196 to 5700 mm (Kaskaoutis et al., 2015). Helmand River and 

Hamoun Lake are the main surface water sources in the Sistan watershed (Mianabadi et al., 2021). 
However, after the construction of a dam in the Helmand River and the occurrence of frequent 

droughts, especially the 2000s, the wetland has dried up and resulted in a dramatic increase in the 

occurrence of dust storms (Rashki et al., 2013b). 
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Fig. 1 Location (a) and elevation (b) maps of the Sistan watershed in Iran and Afghanistan 


2.2 Methods 


2.2.1 Datasets 

In this research, the required data were obtained from topographic maps of 1:25,000 and 1:50,000 
scales, geological maps of 1:100,000 and 1:250,000 scales (https://gsi.ir), satellite images, field 
studies and library resources. 
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2.2.2 Inventory map of dust sources 
The distribution map of the dust source areas was prepared using MODIS imagery from Terra 
(morning) and Aqua (afternoon) satellites (Vickery and Eckardt, 2013). The Terra and Aqua 
imageries correspond to the selected dust days from 2010 to 2019, which were determined based 
on meteorological data such as visibility, wind speed and direction at the Zabol station (Rashki et 
al., 2018). Then, the days when dust occurrence coincides with satellite imagery are determined. 
Using RS techniques and parameters that track dust aerosols (brightness temperature difference 
between MODIS bands 31 and 32 (BTD3132), bands 29 and 31 (BTD2931), normalized difference 
dust index (NDDT) and D), we identified the dust source areas (Bonham-Carter, 1994; Ackerman, 
1997; Sankey et al., 2013). Therefore, four indicators BTD3132, BTD2031, NDDI and D for 
detection of dust from MODIS imagery were used in this research. The brightness temperature 
(BT) is expressed as: 

2hc? 
As he í 

(eAkt —1) 
where BT(T, 4) represents the Planck equation at / (um); h is the Planck's constant (6.626 10-*4 
J-s); k is the Boltzmann's constant (1.38<10-%)°; e is the mathematical constant approximately 
equal to 2.71828 (known as the Euler's number); c is the speed of light (2.99x10* m/s); and t is 
the temperature (°C) (Hao et al., 2007). Using Planck equation, we can derive the value of 
temperature: 
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where L is the amount of radiance in the images (W/(m*-sr-m)). In addition, the NDDI is given 
via the following formula: 


NDDI = Z213 P0469 , (3) 
P2.13 + Po.469 
where p213 and po.469 are the reflectance values at the top-of-atmosphere at 2.13 and 0.469 um, 
respectively (Qu et al., 2006). Finally, D is given by: 


D=exp{-[rrxa+(BTD-b)]}, (4) 


where rr represents the reflectance proportion among the wavelengths of 0.54 and 0.86 um; BTD 
is the difference among the bands 11 and 12 (um); a and b are constants obtained during the 
calibration of the first stage (Miller, 2003; Qu et al., 2006; Hao et al., 2007; Boroughani et al., 
2020). Finally, for dust detection, a false color combination (FCC) was used using four indicators 
(BTD2931, BTD3132, NDDI and D) and bands 3 and 4 of MODIS imagery. In this respect, four 
methods to generate the FCC (i.e., method 1: NDDI, B4 and B3; method 2: D, BTD2931 and NDDI; 
method 3: D, BTD3132 and NDDI; and method 4: BTD2931, B4 and B3) were employed for 
reconstruction of dust sources using the MODIS satellite images. The method of detecting 
dust-source areas in this study was based on the Gaussian-Plum diffusion model (Lee et al., 2009). 
When the dust is formed, the starting point of the dust plume covers a narrow area and the time 
progressing dust is spread and takes a conical shape continuing to expand (Lee et al., 2009). 
Finally, an inventory map of the dust sources in the Sistan watershed was prepared. 

2.2.3 Factors controlling dust sources 


Identifying the factors influencing the occurrence of dust events is the first step, while the 
selection of the important factors has a great role in the accuracy of dust source mapping 
(Gholami et al., 2020a). In this study, after extensive field surveys in the study area, 6 
topographic factors including soil, lithology, slope, NDVI, geomorphology units and land use 
were identified as effective for dust sources. These factors, among others, have been also 
recognized as effective for land susceptibility, wind erosion and dust emissions in the Sistan 
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watershed by previous studies (Gholami et al., 2020c; Ebrahimi-Khusfi et al., 2021). On the other 
hand, Sistan watershed is considered as one of the dustiest and windiest arid environments over 
the globe (Abbasi et al., 2019; Rashki et al., 2021), where the summer Levar wind strongly 
modulates the dust activity (Kaskaoutis et al., 2018; Hamidianpour et al., 2021). However, 
measured wind-speed data were not available at the stations in Afghanistan and, therefore, we 
limited the research by examining the effect of topographic variables. Effect of wind on land 
susceptibility to wind erosion was previously examined in southeastern Iran (Kerman Province) 
with availability of wind speed data (Gholami et al., 2021). 

The land use that is characterized by low vegetation cover or regions that have undergone 
either natural or anthropogenic impact, i.e., poor vegetation or removed soil crust, have a higher 
potential to dust emissions (Goossens and Buck, 2014; Parajuli and Zender, 2017; Gholami et al., 
2021). In this study, Landsat 8 satellite imagery was used to prepare the land use map. These 
images were downloaded from the United States Geological Survey website 
(https://earthexplorer.usgs.gov/). Most of the dust storms in the study area occurred between May 
and July, so satellite images in these months were downloaded. Land use map for 2018 was 
prepared using the environment for visualizing images software. 

Plains and flat terrains with low slope gradients, especially less than 5%, have a higher 
potential to become dust sources than gravel lands, steep slopes and elevated terrains (Lee et al., 
2009; Emamian et al., 2021). In these areas, due to the flatten terrain, the wind velocity easily 
reaches and overcomes the threshold for wind erosion, causing the soil to be raised and dust 
storms to occur (Lee et al., 2009). Digital elevation model with a spatial resolution 30 m x30 m 
was used in the geographical information systems (GIS) software to prepare the slope map of the 
Sistan watershed. The slope ranges between 0% and 32% in the study area. 

Geology is considered as a fundamental variable for susceptibility mapping of dust sources 
(Gholami et al., 2020b), since it represents the erodibility of land and ground surface 
characteristics. The susceptible units have a crucial role in creating dust source areas relative to 
other resistant units and soil types (Sissakian et al., 2013). A geological map of the area was 
prepared using a 1:250,000 scale, while the soil map of the Sistan watershed was prepared using 
the 1:250,000 scale. The soil in the study area is highly sensitive to wind erosion (Gholami et al., 
2020b) and also lacks of organic matter, being not suitable for agriculture. 

To assess the vegetation cover in the study area, we used the NDVI. Vegetation in the Sistan 
watershed is very sparse due to low rainfall, excessive use of natural resources, drought and 
severe dust storms. NDVI shows the pattern of vegetation cover that can be associated with the 
dust source areas (Dawelbait and Morari, 2012; Kharol et al., 2013) and was used to prepare the 
vegetation map of the region. For this purpose, ETM+ images of the Landsat 8 satellite (2018) 
were used. 

Geomorphologic characteristics of dust-blown source regions contribute to the amount and type 
of mineral aerosols emitted into the air (Lee et al., 2012; Rashki et al., 2013). In order to prepare 
the map of geomorphological units of the region, we initially used slope and topographic maps 
with a scale of 1:50,000 and the geological map of the study area. In the next step, from the 
interpretation of Google Earth and Landsat 8 satellite images in 2018, more detailed information 
was extracted and transferred to the original map. Then, by completing the field information and 
final control, we prepared a map with geomorphology units of the region. By combining the dust 
distribution map with the effective factor maps, we obtained the number of dust source pixels in 
each class from the study layers, using WOE and FR models in R statistical software. 

2.2.4 Predictive models for mapping dust source 

For examining the dust source potential (Akbari et al., 2017), dust source areas were randomly 
divided into two categories for training (70%) and validation (30%) datasets. The training data 
were used to determine the weight and prepare the potential map using two probabilistic models 
(WOE and FR), while the validation data were used to validate the potential dust source map. In 
order to determine each class of effective factors in creating the dust source areas, we classified 
all layers of effective factors for modeling. By overlapping the position of each dust source with 
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each of the effective factors, we identified the statistical relationship between them and then, by 
obtaining the frequency of dust pixels of each class of layers and total frequency, we calculated 
the density of dust source in each class. Using the frequency of dust pixels and total frequency, 
we obtained the dust source density in the study area. Finally, we computed the weighting of each 
class based on probabilistic models of WOE and FR. 

WOE probability model is one of the two-variable statistical methods that use the Bayesian 
probabilistic method to determine the relative importance of effective factors by statistical tools 
using the linear logarithm entry form. By overlapping the position of each dust source with each 
of the effective factors, we identified the statistical relationship between them. In parallel, the 
degree that each variable contributes to the dust-source area is evaluated. We defined the 
probabilistic model of WOE based on the calculation of positive (W`) and negative (W°) weights. 
In this model, the weight calculation for the presence or absence of any dust source prediction 
factor (F or F*), based on the presence or absence of dust source (DS or DS") in the study area, is 
as follows: 


gers (5) 
Gs) 
DS 
a 
DS 
W` =log. —~—~. (6) 
P| E, 
DS 


where P is the probability of occurrence; F and F” are the presence or absence of factors affecting 
the formation of dust sources, respectively; DS are the presence of dust sources or the absence of 
dust sources (DS*). In this model, weight values usually have a range between positive and 
negative numbers, which indicates the role of more or fewer variables in the formation of dust 
sources. A positive weight (W*) indicates that there is an effective factor at the dust sources, 
whereas the negative weight (W ) shows that the examined factor does not affect the location of 
the dust sources, leading to a negative correlation. The difference between the positive and 
negative weights (C) indicates the magnitude of the relationship between the effective factor and 
the dust sources. Equations 7, 8 and 9 are used to obtain the final weight: 


C=[(W*)-W ), (7) 
sD = s? (w*)+ s? (W5), (8) 
Wrin = C /SD, (9) 


where S is the variances of (W ) and (W`); SD is the standard deviation, which is equal to the 
square root of the variance of each of the positive and negative weights; and Wrinai (final weight) 
is standardized and used to zone the dust potential sources. By entering weights into the 
information layer map, weighted thematic maps are obtained. The dust source potential index 
(DSPI) was calculated from the sum of these weights (Eq. 10). 

DSPI =È (Wrin) (i=1, 2, 3,....7). (10) 


FR model is a simple spatial evaluation tool for identifying probabilistic relationships between 
dependent and independent variables (Bonham-Carter, 1994). FR determines a simple 
relationship between the occurrence of a dust source and the various affecting variables. In 
determining FR, the ratio of dust source areas in each class to the total dust source areas is 
obtained from the influencing factors and the ratio of the level of each class to the total areas is 
also calculated. The calculation of FR for each class of factors affecting the dust source areas is 
shown in Equation 11. 
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R= ala A (11) 
C/D F 

where A is the number of pixels of dust source areas for each of the effective factor classes; B is 
the total number of dust source areas occurred in the study area; C the number of pixels in each 
of the effective factor classes; D is the total number of pixels in the study area; E is the 
percentage of dust source areas in each subclass of effective factors (%); and F is the relative 
percentage of the area of each subclass to the total area (%). To obtain the DSPI, we added the 
results of the factors together in the GIS software (Eq. 12). 


DSPI=}(FR); (i=1, 2, 3,...,7). (12) 


Finally, the weight of each of the 6 effective factor classes was assessed. The information 
layer of the values of potential factors to the dust source areas was obtained from their algebraic 
sum and finally, the dust potential map was prepared on 4 floors. 


2.2.5 Assessment of predictive models 


The potential maps obtained using FR and WOE models are tested and validated by calculating 
relative operating characteristics (ROC). In this method, the AUC takes values between 0.5 and 
1.0 and was used to evaluate the accuracy of models (Nandi and Shakoor, 2010). The best 
models exhibit an AUC value close to 1.0, while values close to 0.5 indicate high inaccuracies 
in the model performance (Floyd and Gill, 2011). To evaluate the dust source potential maps 
obtained using WOE and FR models, we used the 30% of dust source areas, corresponding to 
the validation dataset. Then, the accuracy of the prepared potential maps was evaluated using 
ROC curve. 

2.2.6 Game theory for assessing the interpretability of predictive models 

In addition, PFIM, suggested by Breiman (2001), was applied to determine the relative 
importance of effective factors for dust emissions. PFIM measures the importance of a feature 
by calculating the increase in the model's prediction error after permutating the feature. SHAP 
was used to interpret the models for predicting land susceptibility to dust emissions in the 
Sistan watershed (Lundberg and Lee, 2017). SHAP, a method based on game theory 
theoretically optimal Shapley values, may explain the individual prediction. SHAP explains the 
prediction of an instance x by computing the contribution of each feature to the prediction 
(Mohammadifaer et al., 2021). SHAP values can, therefore, estimate the expected marginal 
contribution of a factor among all possible contributions (Shapley, 2016). Figure 2 shows the 
whole process of effective factors selection, dust source mapping and evaluation of the models, 
which practically corresponds to the flow chart. 


3 Results and discussion 


3.1 Mapping inventory of dust source 


A total of 211 dust source areas were identified throughout the examined region, 61 located in 
Iran and 150 in Afghanistan (Boroughani, 2020). The analysis was based on the results of 
previous researches that identified dust source areas at different parts of the world (Walker et 
al., 2009; Miller et al., 2012; Boroughani et al., 2020). Figure 3 shows an example of dust storm 
detection from MODIS FCC satellite imagery over the Sistan watershed on 19 September 2014. 
In this FCC image, different dust plumes originated from the Hamoun Lake spreading 
south/southeastward and driven from thermal cyclonic lows usually developed over the Sistan 
watershed during summer due to the very high surface temperature (Kaskaoutis et al., 2015; 
Behrooz et al., 2019). 

We carried out field visits at 4 of the identified dust source areas in order to verify the dust 
sources obtained via RS method. Figure 4 shows the areas that were investigated from 12 to 16 
July 2019, all investigation is within the Iranian territory. During field visits, four variables 
including topographic properties, vegetation type, soil sensitivity to wind erosion and soil 
particle size were investigated. Same figure also includes photos taken at the 7 examined areas 
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Fig. 2 Flow chart of the dust source mapping procedure in the Sistan watershed and model evaluation. NDVI, 
normalized difference vegetation index; FR, frequency ratio; WOE, weights of evidence; AUC, area under curve. 
The abbreviations are the same as the following figures. 
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Fig. 3 An example of dust storm detection in the Sistan watershed from MODIS satellite imagery with FCC 
(false color combination) 


during field visits, where we observed poor vegetation cover, visible plant roots, fine soil particles 
prone to wind erosion and flat terrains (Hamoun Lake) that help wind to reach the threshold 
velocity enabling dust emissions. 

The inventory map of dust source areas in the Sistan watershed is shown in Figure 5. A total of 
211 dust source areas were identified, in which 70% (147 dust source areas) were randomly 
selected as modeling training points and 30% (64 dust source areas) as validation points for the 
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model evaluation. The training and validation points were distributed rather equally along the 
plain areas in the Sistan watershed, apart from the northeastern mountainous and rocky areas 
(Hazarat Mountains in central Afghanistan), where dust sources were very scarce or even absent 
(Fig. 5). 
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Fig. 4 Locations (a) and photos (b—e) of field visits in the Iranian territory of the Sistan watershed from 12 to 16 
July 2019. The figures b—e are the 1—4 observation sites in Figure 4a. 


3.2 Spatial maps of dust source generated by two predictive models 


After determining the weight of all factors affecting dust source area and multiplying it with the 
classes of influential factors, the weight maps are combined and the final maps of potential for 
dust source areas based on the models WOE and FR were obtained. Then, we classified the maps 
into 4 classes based on naturals break method in ArcGIS: low, medium, high and very high 
potentials (Fig. 6a and b). According to FR model (Fig. 6a), we detected the very high potential 
class at the topographic-lower Sistan watershed, as expected, covered by the desiccated Hamoun 
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Fig. 5 Inventory map with distribution of dust sources divided into training points (70%, red) and validation 
points (30%, green) 


Lake and the dried beds of Helmand River at the southern part of the study region (Rashki et al., 
2013; Evenstar et al., 2018; Mianabadi et al., 2021). Therefore, the results show that areas 
associated with wetlands or rivers are more potential to dust emissions, due to often desiccation 
and/or dryness during long drought periods (Xi and Sokolik, 2018; Gholami et al., 2020b, c). On 
the other hand, the high and very high potential classes cover mostly the central part of the Sistan 
watershed, while moderate and low dust potential classes represent the mountainous northeastern 
part of the study region (Hazarat Mountains). In general, the results from WOE model (Fig. 6b) 
present a similar spatial distribution of the potential classes, but highly overestimate the very high 
potential class, which covered nearly the whole central part of the study region and not only the 
Hamoun Lake and the dried beds of Helmand River, as predicted by FR model. Steep, rocky and 
arid mountains in the Iranian part of the study region (westernmost part) are characterized by low 
and moderate potential classes, indicating that the spatial distribution of the classes is highly 
related to the topography and slope characteristics, apart from the low NDVI. As shown in the 
analysis, the very high potential class is strongly related to specific soil characteristics like Orthic 
Solonchaks, mostly around the Hamoun Lakeand Cambie Arenosols along the Helmand River. 
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Fig. 6 Dust source potential maps (DSPM) produced by FR (a) and WOE (b) models in the Sistan watershed 


Table 1 summarizes the areas corresponding to each dust source potential class and the fraction 
of the dust sources belonging in each class for the two models. The results of FR model showed 
that 24.4% of the Sistan watershed is characterized as low potential for dust sources, 20.3% 
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corresponds to moderate potential, 40.8% to high potential, while 14.5% of the Sistan watershed 
exhibits very high potential for dust sources. The results of WOE model present notable 
differences, especially in the determination of high and very high potential classes, since most 
areas with the high potential in FR model are appeared in very high potential class in WOE (Fig. 
6). More specifically, 12.8% of the study area is classified as low potential class, 24.6% as 
moderate potential class, 27.1% as high potential class and 35.5% as very high potential class by 
the result of WOE. Some changes in the different model algorithms may misclassify a specific 
range of values (dust source areas) that belong in the margins between high and very high classes 
and, therefore, such changes in the spatial distribution maps obtained by different models may 
arise. However, the critical is that both models present a similarity in the detection of potential 
dust class, since the summary of high and very high classes correspond to 55.3% according to FR 
model and 62.6% according to WOE model of the total area. In addition, 93.5% of dust sources 
are located in areas characterized by high and very high potential classes by the result of FR 
model. In WOE model, more than 95.0% of dust source areas are located in regions characterized 
by high and very high potential classes to dust emissions with different relative fractions between 
high and very high classes compared with FR model (Table 1). Results presented in this study are 
in agreement with those reported by Gholami et al. (2020b), who found that a large part (>80.0%) 
of the Sistan watershed located in Iran, is susceptible to dust emissions and was classified as high 
and very high susceptibility classes. Behrooz et al. (2019) reported that desiccated beds of the 
Hamoun Lake are the dominant sources for dust storms in the Sistan riverside. Overall, desiccated 
beds of the Hamoun Lake, abandoned agricultural lands and bare lands resulted in high and very 
high susceptibility of dust sources in the Sistan watershed. 


Table 1 Percentage of dust source areas for each potential class and percentage of dust sources in the validation 
phase for FR and WOE models 


Model Potential class of dust source Area covered (%) Dust source (%) 
Low 24.4 0.00 
Moderate 20.3 6.35 
FR : 
High 40.8 50.79 
Very high 14.5 42.86 
Low 12.8 0.00 
Moderate 24.6 4.76 
WOE : 
High 27.1 15.87 
Very high 35.5 79.37 


3.3 Evaluating spatial maps generated by models 


ROC curve was used to evaluate the accuracy of WOE and FR models. The higher the area under 
the curve, the higher the accuracy of the model that varies from 0.5 to 1.0. In general, the 
classification is characterized as excellent for ROC values between 0.9 and 1.0, as very good for 
0.8-0.9, good for 0.70.8, average for 0.6—-0.7 and weak for 0.5—0.6 (Yesilnacar, 2005). As stated 
before, in the present study, 30% of the dust source areas was used for the evaluation phase 
(called forecast rate) and 70% of the dust source areas was used for the modeling phase (called 
success rate). The results from the evaluation of the probabilistic WOE and FR models using 
ROC curve showed that both models exhibited similar values of 0.848 and 0.854, respectively for 
the training phase (Fig. 7a) and 0.825 and 0.831 for the validation phase (Fig. 7b), which are 
classified in the very good category. The training and validation rates indicate that both models are 
acceptable and can be used for satisfactory determination of the potential dust source areas in the 
Sistan watershed. 


3.4 Factors controlling dust sources and assessing relationship between effective factors 
and dust source areas using predictive models 


Spatial distribution maps of soil information, lithology, slope, vegetation index, geomorphology 
units and land use that were introduced as independent variables for the determination of dust 
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Fig. 7 AUC of the receiver operating characteristic (ROC) curve. (a) success rate curve for the dust source 
potential map of FR and WOE models (training phase); (b) forecast rate curve for the dust source potential map 
of FR and WOE models (validation phase). 


source areas in the Sistan watershed are shown in Figure 8. Based on the Figure 8a, 
geomorphological landscapes in the study area include playa, dagg, covered pediment, rock 
pediment, erosion pediment and mountain in the northeast. A great part of the study area is 
consisted of bare land and moderate rangeland, while agricultural areas located in southern 
Hamoun Lake are very limited (Fig. 8b). Furthermore, sand dunes are observed in areas 
downwind of the major dust sources and along the main dust plumes, as shown in Figure 3. In 
view of lithology, most area of the Sistan watershed is covered by sedimentary consolidated and 
non-consolidated rocks (Fig. 8c). In addition, three soil types including Lithosols, Calcic 
Yermosols and Orthic Solonchacks cover a great part of the study area (Fig. 8d), while the later 
dominates in the playas and around the Hamoun Lake, i.e., in the areas that are mostly responsible 
for the dust storms in the Sistan watershed. The whole Sistan watershed had poor and sparse 
vegetation cover, but a small part, including central and northeastern regions, had a NDVI 
value >0.1 (Fig. 8e). The central part, characterized by flat topography and slope <2%, covers 
more than 50% of total area, while steep mountains cover the northeastern part, as well as areas in 
the west (Fig. 8f). Apart from the geological characteristics that are examined here as potential 
factors for dust sources, previous studies in the Sistan watershed revealed that surface wind speed, 
maximum air temperature, relative humidity, Hamoun Lake and the frequency of erosive winds 
constitute the most important factors for predicting dusty days and land susceptibility to wind 
erosions (Ebrahimi-Khusfi et al., 2021). However, the lack of such observational meteorological 
data in the Sistan watershed belonging to Afghanistan, limited the current analysis to geological 
factors. 

Results of the relationship between each of effective factors and dust source areas using FR and 
WOE models are presented in Table 2. Based on the results obtained from WOE model and the 
relationship between land-use parameters and dust source areas, it was found that the highest 
correlation value was found in periodically dry lake and bare land (0.21), while for other land use 
types (e.g., marsh land, residential area, flood plain, salt land and agriculture), correlation was 
reduced to zero or negative, which indicates no association between these land use types and dust 
source areas. In FR model, dried-lake surfaces and irrigated agricultural lands displayed the 
largest coefficients of 2.91 and 1.44, respectively, indicating that irrigated agriculture lands have 
also high potential in being transformed to dust sources after being abandoned, as also found for 
water surfaces (e.g., Hamouns Lake) after desiccation (Rashki et al., 2013). Marsh land, 
residential areas, salt land and flood plain have the lowest values of correlation, indicating lower 
probability of dust sources in these areas. These results are consistent with those of previous 
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Fig. 8 Spatial maps of effective factors on the dust sources. (a) geomorphology; (b) land use; (c), lithology; (d) 
soil; (e) NDVI; (f) slope. The points show the dust source areas (training and validation points). 


studies, which determined the dust sources in different land use types (Lee et al., 2009; Jewell and 
Nicoll., 2011; Crouvi et al., 2012; Lee et al., 2012; Miller et al., 2012). Zobeck et al. (2013) found 
that most of dust source areas are located in poor pastures and rainfed agricultural lands, which is 
consistent with the results in the Sistan watershed. The current results are also in agreement with 
Lindley et al. (2011), who considered poor vegetation and unhardened sediments as important 
factors affecting dust source and wind erosion. 

Results about topographic slope showed that slope class of 0%-2% has the highest weight and 
correlation with dust sources for both models (2.49 and 0.43 for FR and WOE models, 
respectively) indicating that dust source potential decreases with increasing slope. In FR model, 
slopes less than 5%, which correspond to plains and flat areas, have the highest correlation with 
dust source areas, while slopes more than 5% that represent hilly and mountainous areas exhibit a 
lower coefficient with dust source areas. Results of surveying the topographic slope classes using 
WOE model show that only slope class of 0%-2%, with a weight of 0.43 has a positive 
relationship with dust source areas, while slope classes 2%-5%, 5%-8%, 8%-12% and >32% 
display negative relationships. As shown by the distribution of dust source areas on the slope map 
(Fig. 8f), the maximum and minimum points of dust sources distribute in the slopes from 0% to 
2% and more than 32%, respectively. In slope class of 0%-2%, there are 107 points, which cover 
50.71% of dust source areas and in slope class of more than 32%, only 6 points (2.80%) exist. 
The results of distribution of dust source areas on geological map of the Sistan watershed indicate 
that the highest dust source area distribute on the soil of Orthic Solonchaks, which is in the 
category of soils of arid and desert areas (Aridi Sol), with 129 dust points, corresponding to 
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Table 2 Relationship between each of the effective factors and dust source areas using FR and WOE models 


Number of Number of Dust source 


Factor pixels in class Cay) dust source (%) rE YOR 
Land use 
Moderate rangeland 32,868.54 21.96 6 2.84 0.13 —0.98 
Agriculture 5912.27 3.95 12 5.69 1.44 —0.98 
Bare land 101,275.60 67.67 172 81.52 1.20 0.21 
Marsh land 1266.97 0.85 0 0.00 0.00 0.00 
Sand dune 3960.49 2.65 5 2.37 0.90 0.00 
Water-dry lake 3897.69 2.60 16 7.58 2.91 0.21 
Residental area 113.33 0.08 0 0.00 0.00 0.00 
Flood plain 94.46 0.06 0 0.00 0.00 0.00 
Salt land 278.41 0.19 0 0.00 0.00 0.00 
Lithology 
Sedimentary rocks-consolidated 96,815.97 64.69 111 52.61 0.81 0.54 
Sedimentary rocks-unconsolidated 42,117.53 28.14 98 46.45 1.65 0.54 
Volcano rocks 8087.59 5.40 2 0.95 0.18 0.32 
Intrusive rocks 1452.63 0.97 0 0.00 0.00 0.00 
Metamorphic rocks 1194.05 0.80 0 0.00 0.00 0.00 
Slope 
0%-2% 30,529.54 20.40 107 50.71 2.49 0.43 
2%-5% 23,344.82 15.60 43 20.38 1.31 —0.81 
5%-8% 16,660.65 11.13 18 8.53 0.77 0.81 
8%-12% 16,630.44 11.11 19 9.00 0.81 -1.98 
12%-32% 39,260.29 26.23 18 8.53 0.33 0.00 
>32% 23,242.03 15.53 6 2.84 0.18 -1.98 
Soil 
Calcaric Fluvisols 914.13 0.61 0 0.00 0.00 -1.50 
Calcic Yermosols 66,191.80 44.23 48 22.75 0.51 —0.57 
Cambic Arenosols 6231.48 4.16 9 4.27 1.00 1.02 
Lithosols 42,050.36 28.10 13 6.16 0.22 0.00 
Orthic Solonchaks 31,610.87 21.12 129 61.14 2.88 0.91 
Water bodies 1770.44 1.18 12 5.69 4.43 0.00 
Geomorphology 
Playa 21,037.10 14.06 84 39.81 2.83 1.02 
Clay pan (Dagg) 29,216.91 19.52 63 29.86 1.53 0.44 
Covered pediment 36,726.39 24.54 43 20.38 0.83 0.32 
Rock pediment 36,130.21 24.14 16 7.58 0.31 —0.83 
Erosion pediment 23,359.55 15.61 5 2.37 0.15 -1.32 
Mountain 3197.62 2.14 0 0.00 0.00 -1.32 
NDVI 
—0.4-0.0 148,415.93 99.16 211 100.00 1.01 0.01 
0.0-0.1 1048.86 0.70 0 0.00 0.00 0.00 
0.1-0.2 141.04 0.09 0 0.00 0.00 0.00 
>0.2 61.94 0.04 0 0.00 0.00 0.00 


61.14% of total dust source areas. Other soil types present correlation values less than 1 and/or 
even negative. Similarly, results of WOE model, regarding the relationship between soil 
characteristics and dust source areas, show that the highest values belong to the Orthic 
Solonchaks and Cambic Arenosol classes with coefficients of 0.91 and 1.02, respectively, while 
the lowest, to the Calcaric Fluvisols class with a coefficient of —1.50 (Table 2). 
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Regarding NDVI, areas with sparse vegetation and bare soil exhibit the highest probability of 
dust occurrence and areas with dense vegetation the lowest (Dawelbait and Morari, 2012). Both 
models show that all the weights belong to NDVI class —0.4 to 0.0, in which dust source areas is 
located, while the other classes, which are without dust sources, have not taken any weight. 
Regarding the geomorphological units and types in the study area, according to FR model, it is 
found that the highest values are related to salt playa (desert) and clay pan (Dagg) with the values 
of 2.83 and 1.53, respectively, indicating a high correlation, whereas the lowest value (zero) is 
related to mountain class. Results obtained from WOE model present lower values or even 
negative for certain geomorphology types (Table 2). Regarding WOE model, the highest value of 
dust source areas is related with playa and then clay pan terrains. By examining dust source areas 
on the lithological map of the region, it is found that most of dust sources are related to the unit of 
discontinuous and continuous sediments. In FR model, the highest correlation is between 
discontinuous sediment classes with a coefficient of 1.65. In WOE model, discontinuous and 
continuous sedimentary strata with a coefficient of 0.54 exhibit the highest potential and a direct 
relationship with dust source areas. Furthermore, there is no dust source areas in the metamorphic 
and intrusive rocks class (Table 2). 

The degree of participation (importance) of each factor on dust source areas from both FR and 
WOE models is presented in Table 3. Results show that in both models, soil factor exhibited the 
most effect, followed by geomorphology and slope in FR model and the reverse in WOE model. In 
addition, NDVI and lithology had the least effects in FR and WOE models, respectively (Table 3). 


Table 3 Relative importance of each of effective factors on dust source areas according to FR and WOE models 


Factor FR Factor WOE 
Soil 0.9080 Soil 0.908 
Geomorphology 0.6160 Slope 0.693 
Slope 0.5750 Geomorphology 0.616 
Land use 0.2000 NDVI 0.508 
Lithology 0.1720 Land use 0.200 
NDVI 0.0001 Lithology 0.172 


3.5 Interpretability of predictive models using game theory 


PFIM and SHAP plots show the relative importance of factors controlling dust emissions and 
their contributions to base value and predictions (Fig. 9). Based on PFIM, the relative importance 
of effective controlling factors on dust emissions was as follows: 
slope>NDVI>soil>geomorphology>land use>lithology (Fig. 9a). SHAP values for each feature 
are presented in Figure 9b. The highest and lowest values of SHAP were calculated for soil and 
geomorphology, respectively. Figure 9c combines the feature importance with feature effects. The 
color shows the value of factors from low to high. Based on SHAP, the contribution of factors 
controlling dust emissions was as follows: soil>slope>NDVI>land 
use>lithology>geomorphology. 


4 Conclusions 


The novel contribution of this study is combining indicators extracted from RS, statistic-based 
predictive models and game theory to spatial mapping of land susceptibility to dust emissions in a 
very important dust source area—Sistan watershed-in the borders of Iran and Afghanistan. In order 
to generate the spatial maps of dust source areas, we initially developed the inventory map of dust 
sources by using DSPI based on RS data. Then, two predictive models (FR and WOE) were 
applied to determine the relative importance of factors that control land susceptibility to dust 
emissions in the Sistan watershed. The model results were validated using ROC method and AUC 
indicator, while game theory (PFIM and SHAP) was applied to assess the interpretability of 
predictive models. Based on both models, soil, slope and geomorphology were found as the most 
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Fig. 9 (a) relative importance of factors controlling dust emissions by PFIM (permutation feature importance 
measure); (b) contribution of base value for factors controlling dust emissions by SHAP; (c) SHAP value for each 
factor. 


important topographic factors controlling dust emissions in the study area. According to the 
results of both predictive models, >55% of the study area was categorized into high and very high 
susceptibility classes. The results from models' evaluation using ROC method showed that both 
models have high and very similar performance. Based on SHAP, three factors consisting slope, 
NDVI and soil had the most impact on model's predictions. Overall, our modelling approach is 
recommended to generate spatial maps of different environmental hazards, especially wind 
erosion and predict land susceptibility to dust emissions over dusty areas with active wind 
erosion. 
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